This is a data exploration of contributions to the 2016 US presidential election in the state of Florida. Florida is generally regarded as one of the “swing” states: states on which the result of the presidential election depends, where the two largest parties, Republicans (‘Red’) and Democrats (‘Blue’), have similar support and the result can go either way.
The dataset comes from the Federal Election Commission.
I am aiming to gain some insight into the following questions:
Let’s begin by adding the basic packages for analysis and loading the data.
The dataset has 400K+ observations and 19 variables.
## cmte_id cand_id cand_nm contbr_nm
## 1 C00580100 P80001571 Trump, Donald J. SELLAS, KRISTEN
## 2 C00580100 P80001571 Trump, Donald J. SELLERS, CHRIS
## 3 C00575795 P00003392 Clinton, Hillary Rodham SCHECTER, MITCHELL
## 4 C00577130 P60007168 Sanders, Bernard LERBS, CHRISTOPHER
## 5 C00575795 P00003392 Clinton, Hillary Rodham CHOREY, LORI
## 6 C00580100 P80001571 Trump, Donald J. PUSINS, AUDREY
## contbr_city contbr_st contbr_zip contbr_employer
## 1 CLEARWATER FL 33759 RETIRED
## 2 VERO BEACH FL 32966 PROST BEVERAGE COMPANY LLC
## 3 PLANTATION FL 333243808 TERRANOVA
## 4 GREEN COVE SPRINGS FL 320433443 NONE
## 5 TALLAHASSEE FL 323121800 SWEAT THERAPY FITNESS
## 6 BOCA RATON FL 33433 PBSO
## contbr_occupation contb_receipt_amt contb_receipt_dt
## 1 RETIRED 68.37 09-NOV-16
## 2 BEVERAGE INDUSTRY EXECUTIVE 80.00 19-NOV-16
## 3 CONTROLLER 15.00 22-APR-16
## 4 NOT EMPLOYED 50.00 05-MAR-16
## 5 FITNESS TRAINER 100.00 06-APR-16
## 6 LAW ENFIRCEMENY 76.89 02-DEC-16
## receipt_desc memo_cd memo_text form_tp
## 1 X SA18
## 2 X SA18
## 3 X * HILLARY VICTORY FUND SA18
## 4 * EARMARKED CONTRIBUTION: SEE BELOW SA17A
## 5 X * HILLARY VICTORY FUND SA18
## 6 X SA18
## file_num tran_id election_tp
## 1 1146165 SA18.145176 G2016
## 2 1146165 SA18.120235 G2016
## 3 1091718 C4746745 P2016
## 4 1077404 VPF7BKX6F26 P2016
## 5 1091718 C4682258 P2016
## 6 1146165 SA18.133502 G2016
The data is mostly categorical, with most of the features being text or numbers. Certain variables such as file_num, tran_id and memo_text might not be useful for our analysis; I will consider removing them from the data frame if needed.
Exploring the unique values of the election type column:
## # A tibble: 5 x 2
## election_tp `n()`
## <fctr> <int>
## 1 1244
## 2 G2016 153073
## 3 O2016 54
## 4 P2016 271685
## 5 P2020 1
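A count like the one above can be sketched in base R as follows. This is a minimal illustration on made-up rows, not the original chunk; only the column name `election_tp` comes from the data set, and the toy values are assumptions.

```r
# Toy rows (illustrative values); election_tp is the column name from the FEC file.
toy <- data.frame(
  election_tp = c("P2016", "G2016", "P2016", "", "P2016", "G2016"),
  stringsAsFactors = FALSE
)

# table() counts each distinct value, including the empty string
# that marks contributions with no election type recorded.
counts <- table(toy$election_tp)
print(counts)
```

Blank election types show up as their own group, which is how the unlabeled first row of the output above arises.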
We can see that most of the contributions recorded are for the 2016 primaries (about 272k), with around 153k for the general election. We’ll filter the data on this column to look at the primaries and the general election separately if needed.
Checking the number of candidates who received contributions across the entire election, we’ll remove those candidates who received fewer than 1000 contributions.
## # A tibble: 10 x 2
## cand_nm `n()`
## <fctr> <int>
## 1 Bush, Jeb 6045
## 2 Carson, Benjamin S. 16074
## 3 Clinton, Hillary Rodham 184378
## 4 Cruz, Rafael Edward 'Ted' 29153
## 5 Fiorina, Carly 2056
## 6 Kasich, John R. 1342
## 7 Paul, Rand 2028
## 8 Rubio, Marco 20472
## 9 Sanders, Bernard 82523
## 10 Trump, Donald J. 78970
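One way to apply such a threshold is sketched below in base R. The candidate labels and the small `min_n` are placeholders chosen for the toy data; the analysis itself uses a cutoff of 1000.

```r
# Toy data with hypothetical candidate labels.
toy <- data.frame(
  cand_nm = c(rep("Candidate A", 5), rep("Candidate B", 2), rep("Candidate C", 4)),
  stringsAsFactors = FALSE
)

min_n <- 3  # the write-up uses 1000; 3 keeps the toy example small

# Count contributions per candidate, then keep only rows whose
# candidate meets the threshold.
n_per_cand <- table(toy$cand_nm)
keep <- names(n_per_cand)[n_per_cand >= min_n]
filtered <- toy[toy$cand_nm %in% keep, , drop = FALSE]

print(table(filtered$cand_nm))
```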
There is no ‘Party’ column in the dataset to identify which party each candidate belonged to.
Adding a new column ‘party’ to the dataset.
## # A tibble: 6 x 18
## # Groups: cand_nm [3]
## cmte_id cand_id cand_nm contbr_nm
## <chr> <fctr> <fctr> <fctr>
## 1 C00580100 P80001571 Trump, Donald J. SELLAS, KRISTEN
## 2 C00580100 P80001571 Trump, Donald J. SELLERS, CHRIS
## 3 C00575795 P00003392 Clinton, Hillary Rodham SCHECTER, MITCHELL
## 4 C00577130 P60007168 Sanders, Bernard LERBS, CHRISTOPHER
## 5 C00575795 P00003392 Clinton, Hillary Rodham CHOREY, LORI
## 6 C00580100 P80001571 Trump, Donald J. PUSINS, AUDREY
## # ... with 14 more variables: contbr_city <fctr>, contbr_st <fctr>,
## # contbr_zip <dbl>, contbr_employer <fctr>, contbr_occupation <fctr>,
## # contb_receipt_amt <dbl>, contb_receipt_dt <fctr>, receipt_desc <fctr>,
## # memo_cd <fctr>, memo_text <fctr>, form_tp <fctr>, file_num <int>,
## # tran_id <fctr>, election_tp <fctr>
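A party label can be derived from the candidate name. The sketch below assumes the data has already been filtered to the ten candidates listed earlier, so that anyone who is not Clinton or Sanders is a Republican; the original chunk may have been written differently.

```r
# Democratic candidates among the ten retained names; every other
# retained candidate ran as a Republican.
democrats <- c("Clinton, Hillary Rodham", "Sanders, Bernard")

toy <- data.frame(
  cand_nm = c("Trump, Donald J.", "Clinton, Hillary Rodham",
              "Sanders, Bernard", "Rubio, Marco"),
  stringsAsFactors = FALSE
)

# Vectorised lookup: membership in the Democrat list decides the label.
toy$party <- ifelse(toy$cand_nm %in% democrats, "Democrat", "Republican")
print(toy)
```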
ZIP codes are stored as large numbers (some in 9-digit ZIP+4 form), so we extract the first 5 digits for standardization.
## # A tibble: 6 x 19
## # Groups: cand_nm [3]
## cmte_id cand_id cand_nm contbr_nm
## <chr> <fctr> <fctr> <fctr>
## 1 C00580100 P80001571 Trump, Donald J. SELLAS, KRISTEN
## 2 C00580100 P80001571 Trump, Donald J. SELLERS, CHRIS
## 3 C00575795 P00003392 Clinton, Hillary Rodham SCHECTER, MITCHELL
## 4 C00577130 P60007168 Sanders, Bernard LERBS, CHRISTOPHER
## 5 C00575795 P00003392 Clinton, Hillary Rodham CHOREY, LORI
## 6 C00580100 P80001571 Trump, Donald J. PUSINS, AUDREY
## # ... with 15 more variables: contbr_city <fctr>, contbr_st <fctr>,
## # contbr_zip <chr>, contbr_employer <fctr>, contbr_occupation <fctr>,
## # contb_receipt_amt <dbl>, contb_receipt_dt <fctr>, receipt_desc <fctr>,
## # memo_cd <fctr>, memo_text <fctr>, form_tp <fctr>, file_num <int>,
## # tran_id <fctr>, election_tp <fctr>, party <chr>
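A possible way to truncate the ZIP codes is shown below, using values from the sample rows. `sprintf("%.0f", ...)` is used instead of `as.character()` as a precaution, because `as.character()` can switch to scientific notation for some round numbers; this is a sketch, not the original code.

```r
# ZIP values as they appear in the sample rows (numeric, 5 or 9 digits).
zips <- c(33759, 32966, 333243808, 320433443)

# Format without scientific notation, then keep the first five characters.
# Florida ZIP codes start with 3, so no leading zeros are lost here.
zip5 <- substr(sprintf("%.0f", zips), 1, 5)
print(zip5)
```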
Analysing some data points related to the date of contribution, the party that received the contribution, and contributions per candidate.
Starting with the date of contribution: to get more meaningful numbers, we’ll look at how many days before election day (8 November 2016) each contribution was made.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -53.0 69.0 147.0 174.4 254.0 1134.0
We can see that the median contribution came about 5 months (147 days) before the election, just after the primaries, when campaigning was in full swing. The minimum value is -53; negative values are contributions recorded after the election was over (the sample rows above include dates such as 02-DEC-16), and -53 corresponds to 31 December 2016.
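The days-before-election figure can be computed roughly as follows. The dates use the `DD-MON-YY` layout of `contb_receipt_dt`; the month is matched against R's built-in `month.abb` so that the sketch does not depend on the system locale. The specific dates are illustrative.

```r
election_day <- as.Date("2016-11-08")

# Dates in the contb_receipt_dt layout (values chosen for illustration).
dates <- c("09-NOV-16", "22-APR-16", "31-DEC-16")

# Split into day, month abbreviation and two-digit year.
parts <- strsplit(dates, "-", fixed = TRUE)
day <- as.integer(sapply(parts, `[`, 1))
mon <- match(sapply(parts, `[`, 2), toupper(month.abb))
yr  <- 2000L + as.integer(sapply(parts, `[`, 3))

parsed <- as.Date(sprintf("%04d-%02d-%02d", yr, mon, day))

# Positive = before the election, negative = after it.
days_before <- as.integer(election_day - parsed)
print(days_before)
```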
Let’s see the trends for the date of contribution.
From the plot we can see that there was some rise in contributions around the primaries (244-300 days before the election). Many people donated around the 100-day mark, when the election campaign was under way, and there is a spike in contributions just before election day, which can be attributed to the final surge of campaigning by both candidates.
The same trend can be seen around the primaries: there was a surge of contributions just before them, and contributions dropped off just after.
First, checking the total contributions for both major parties.
As we can see, although there were far more Republican candidates than Democratic ones, the number of contributions to Democrats is much higher. We’ll assess the value of these contributions in the next section.
Surprisingly, Hillary Clinton received the largest number of contributions in the state of Florida. On the Republican side, Jeb Bush and Marco Rubio were local candidates, yet even they received fewer contributions.
This might be because Jeb Bush dropped out of the race before the Florida primary and Marco Rubio did not emerge as the Republican candidate at the end of the primaries.
### Contribution percentage per candidate
Let’s see each candidate’s contributions as a proportion of the total number of contributions. Let’s also analyse how this differed between the primaries and the general election.
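Shares of this kind can be computed with `prop.table()`. The sketch below uses made-up rows and placeholder candidate labels; only the column names `election_tp` and `cand_nm` come from the data set.

```r
# Toy rows: four primary and five general-election contributions.
toy <- data.frame(
  election_tp = c(rep("P2016", 4), rep("G2016", 5)),
  cand_nm = c("Candidate A", "Candidate A", "Candidate B", "Candidate C",
              "Candidate A", "Candidate A", "Candidate A", "Candidate B", "Candidate B"),
  stringsAsFactors = FALSE
)

# Cross-tabulate election type by candidate, then convert each row
# to percentages (margin = 1 normalises within an election type).
share <- round(100 * prop.table(table(toy$election_tp, toy$cand_nm), margin = 1), 1)
print(share)
```

Normalising within each election type keeps the primaries and the general election comparable even though their totals differ.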
Starting with the primaries (filtering out candidates who had very few contributions).
We can see that Hillary Clinton and Bernie Sanders had most of the contributors, with Trump having the largest share on the Republican side.
Let’s see how this changed for the general election, where there were only 2 candidates.
It seems Hillary Clinton had overwhelmingly more contributors than Donald Trump in the general election.
It’s interesting to note that even for the general election, people contributed to other candidates who had dropped out of the race.
Another interesting point is that even though Hillary Clinton had a larger number of contributors, she ended up losing the election in Florida.
Let’s see which election, primaries or general, had the larger number of contributions.
We can clearly see that there were many more contributions for the primaries than for the general election.
Let’s do further analysis on the election data set.
Let’s look at the contributions by separating Democrats and Republicans.
It looks like the number of contributions to Republicans varied a lot between the general election and the primaries: the count decreased significantly in the general election, while it was much more consistent for Democrats.
It’s worth noting that in the general election, the Republican candidate Donald Trump emerged victorious.
Let’s see how much each candidate received.
We can clearly see that Donald Trump on the Republican side and Hillary Clinton on the Democratic side received the highest contributions.
## # A tibble: 6 x 3
## # Groups: cand_nm [6]
## cand_nm party avg_fund
## <fctr> <chr> <dbl>
## 1 Bush, Jeb Republican 1093.95272
## 2 Carson, Benjamin S. Republican 113.98395
## 3 Clinton, Hillary Rodham Democrat 118.87217
## 4 Cruz, Rafael Edward 'Ted' Republican 97.68068
## 5 Fiorina, Carly Republican 208.08759
## 6 Kasich, John R. Republican 566.98261
We can clearly see that Jeb Bush had a much higher average contribution than the others; this is primarily because a few of the contributions to Jeb Bush were extraordinarily high. While Hillary Clinton had the largest number of contributions, as we saw earlier, her average contribution was much lower.
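The effect of a few very large contributions on an average can be illustrated by comparing the mean with the median. The amounts and candidate labels below are invented for the example; the point is only that one large value pulls the mean up while the median barely moves.

```r
# Toy amounts: Candidate A has one very large contribution.
toy <- data.frame(
  cand_nm = c(rep("Candidate A", 3), rep("Candidate B", 3)),
  contb_receipt_amt = c(50, 100, 2700, 25, 50, 75),
  stringsAsFactors = FALSE
)

# Per-candidate mean and median of the contribution amount.
avg_fund <- tapply(toy$contb_receipt_amt, toy$cand_nm, mean)
med_fund <- tapply(toy$contb_receipt_amt, toy$cand_nm, median)

print(rbind(mean = avg_fund, median = med_fund))
```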
It is looking more and more like it is the number of contributions that counts rather than the amount of each contribution.
How did the contribution amount vary with party? Here we use a binwidth of 10 and limit the amount to <1000, as that covers most of the contributions.
Next, a binwidth of 5 with the amount limited to <300, to see some variation by party.
We can see that the contribution amount was generally higher for Republicans than for Democrats.
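The binning behind such a histogram can be reproduced without plotting. The sketch below bins made-up amounts into width-5 intervals below 300; dropping negative amounts is an assumption of this example, not something stated in the analysis.

```r
# Illustrative amounts, including a refund and two large values.
amt <- c(5, 12, 25, 25, 100, 250, 999, 2700, -50)

# Keep the range shown in the plot (assumption: negatives are excluded).
in_range <- amt[amt >= 0 & amt < 300]

# Width-5 bins, closed on the left: [0,5), [5,10), ...
bins <- cut(in_range, breaks = seq(0, 300, by = 5), right = FALSE)
bin_counts <- table(bins)

# Show only the non-empty bins.
print(bin_counts[bin_counts > 0])
```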
Representing the same data in a box plot, to see if we can gather some more insights.
The median amount for Republicans is higher than that for Democrats.
Let’s see if this holds true if we consider only the general election.
Even in the general election, the median contribution to Republicans was higher than that to Democrats.
In fact, when comparing the above 2 plots, we can see that the Republican contribution amount increased for the general election, while the Democratic contribution remained mostly the same. While the total contributions to Hillary Clinton were still higher than those to Donald Trump, if we consider only the general election, the typical donation to Republicans is larger than that to Democrats. This is telling, as the final result for the state of Florida was Donald Trump winning in the general election. Florida, being a swing state, played an important role in the election outcome.
### How was the contribution distributed for candidates when measuring with days from election
We can see that around 300 days before the election, around the primaries, each candidate had a jump in contributions.
Similarly, towards the end, Donald Trump and Hillary Clinton had a jump in contributions, with a particularly large spike for Clinton.
Let’s see if we can draw any conclusions from the occupations of the highest contributors.
We can see that for both parties, retirees contributed the most; for Republicans in particular, a large share of the contributions came from retired individuals.
We can also see how the range of contributors varied between Republicans and Democrats.
For Republicans, the majority of donations came from retirees, homemakers, contractors and business people. For Democrats, apart from retirees, a lot of people who were not employed donated.
### Cumulative contribution by candidate with date
Let’s analyse the distribution of contributions for different candidates and how it varied with time.
We can see that the rise and fall in contributions to both parties’ candidates was consistent across the last 100 days before the election.
One interesting point to note here is that initially the contributions to Donald Trump were huge compared to those to Hillary Clinton, but after that they were fairly consistent, with Hillary Clinton finishing strong at the end.
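A running total per candidate can be built by ordering the rows in time and applying `cumsum()` within each candidate. The sketch below uses invented rows; `days_before` is a hypothetical name for the days-before-election value, and the candidate labels are placeholders.

```r
# Toy rows: days_before is a hypothetical column name.
toy <- data.frame(
  cand_nm = c("Candidate A", "Candidate B", "Candidate A", "Candidate B", "Candidate A"),
  days_before = c(100, 90, 60, 30, 10),
  contb_receipt_amt = c(50, 20, 25, 30, 100),
  stringsAsFactors = FALSE
)

# Order chronologically within candidate (larger days_before = earlier).
toy <- toy[order(toy$cand_nm, -toy$days_before), ]

# ave() applies cumsum() separately inside each candidate group.
toy$cum_amt <- ave(toy$contb_receipt_amt, toy$cand_nm, FUN = cumsum)
print(toy)
```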
Comparison of the average contribution to Democrats and Republicans on a month-by-month basis.
This is a great representation of how the average contribution varied with time for Democrats and Republicans. We can see that there was a huge difference between donations to Republicans and Democrats, with those to Republicans being much larger, although the difference was very small towards the end of the race. There was never a point at which contributions to Democrats soared above those to Republicans.
The one negative point in the plot can be considered an outlier, as it is most likely related to the return or rearrangement of funds after several candidates dropped out of the race near the primaries.
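A monthly average by party can be obtained by grouping on a year-month key. The rows below are invented, and `receipt_date` and `month` are hypothetical names for the parsed date and the derived key; the negative amount stands in for a refund, which is how a monthly mean can dip below zero.

```r
# Toy rows: receipt_date is a hypothetical, already-parsed date column.
toy <- data.frame(
  receipt_date = as.Date(c("2016-03-01", "2016-03-15", "2016-03-20",
                           "2016-04-02", "2016-04-10")),
  party = c("Democrat", "Democrat", "Republican", "Republican", "Republican"),
  contb_receipt_amt = c(10, 30, 100, 200, -50),
  stringsAsFactors = FALSE
)

# Year-month key, e.g. "2016-03".
toy$month <- format(toy$receipt_date, "%Y-%m")

# Mean contribution per month and party.
monthly <- aggregate(contb_receipt_amt ~ month + party, data = toy, FUN = mean)
print(monthly)
```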
Here we can see how the average contribution for each candidate changed over the course of the one-year election campaign.
Now we’ll see how much contribution came from different cities. First, let’s plot where the largest contributions came from, keeping cities with total funds > 50000.
There is a lot of variation from city to city.
## # A tibble: 1 x 3
## # Groups: contbr_city [1]
## contbr_city party cand_funds
## <fctr> <chr> <dbl>
## 1 MIAMI Democrat 2278152
We can see the top contributor is Miami, with total funds of 2,278,152 going to Democrats.
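The city totals can be sketched with `aggregate()`. The example groups by city and party, matching the columns of the output above (`contbr_city`, `party`, `cand_funds`), and applies the 50000 cutoff to each group; the amounts are invented and the original chunk may have filtered differently.

```r
# Toy rows with invented amounts.
toy <- data.frame(
  contbr_city = c("MIAMI", "MIAMI", "MIAMI", "TAMPA", "TAMPA", "OCALA"),
  party = c("Democrat", "Democrat", "Republican", "Democrat", "Republican", "Republican"),
  contb_receipt_amt = c(40000, 30000, 20000, 60000, 10000, 5000),
  stringsAsFactors = FALSE
)

# Total funds per city and party.
city_funds <- aggregate(contb_receipt_amt ~ contbr_city + party, data = toy, FUN = sum)
names(city_funds)[3] <- "cand_funds"

# Keep the large groups, then pick the single largest one.
big <- city_funds[city_funds$cand_funds > 50000, ]
top <- big[which.max(big$cand_funds), ]
print(top)
```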
When analysing the top 1% of contributors, we see much less variation.
For visualizing the maps we’ll leverage the maps package. To use the maps package we need to convert the ZIP codes to lat/long, using the ‘zipcode’ package.
We can see that the contributions are fairly well spread out, coming from all over the state.
Now let’s look at this by candidate: where did each candidate get the most contributions from?
It looks like Donald Trump got contributions from everywhere in the state.
Let’s restrict this data set to the general election and compare Democrats and Republicans.
We know that Democrats got more contributions than Republicans.
Although for both parties the contributions were more or less evenly spread out, it looks like many of the Republicans’ contributions came from certain strongholds, which makes it look as if more donations went to Republicans.
In this section, we polish and analyse the best-looking and most informative plots we discovered above.
This shows how the contributions varied for each candidate. Most contributions went to a few candidates, with Hillary Clinton, Donald Trump and Bernie Sanders receiving the most.
When plotting the entire campaign contribution data for Democrats and Republicans, we can see that contributions to Democrats were evenly distributed, while for Republicans the bulk of the contributions came from specific areas, especially to the west of Florida.
We saw earlier how the contribution amount varied for the candidates over the last 100 days. This plot shows how the average contribution to each candidate varied over the year before the election, and how it increased for Donald Trump in particular at the end.
I encountered a few issues while doing this analysis, primarily inadequate data, particularly the lack of gender and income data.
I believe that if contributor age/gender data had been available as well, it would have generated interesting data points, and a lot could have been analysed based on age/gender and which candidate each major age group or gender supported.
It would also have been interesting to see the contributions with respect to the mean income of each city and draw conclusions from that.
By looking at the donation data we can catch a glimpse of how the candidates polled during the primaries and the general election.
It is interesting to note that Democrats got a larger number of contributions in the state of Florida, with Hillary Clinton getting the most. Even for the general election there were far fewer contributions to Republicans. Still, Donald Trump, the Republican candidate, won the state in the election.
I was able to answer most of the questions I posed before the analysis.
We can do further analysis, in particular of each candidate’s spending leading up to the general election (Hillary Clinton and Donald Trump), and see whether the combination of campaign contribution and spending data had any correlation with the ultimate result, i.e. Donald Trump winning the state of Florida.
I would love to combine this data with Florida Election Watch and then draw even more analysis from the combined data set.
1. http://www.datacarpentry.org/dc_zurich/R-ecology/05-visualisation-ggplot2.html
3. https://uchicagoconsulting.wordpress.com/2011/04/18/how-to-draw-good-looking-maps-in-r/